Estimation of Aqueous Solubility for a Diverse Set of Organic Compounds Based on Molecular Topology
Abstract
An accurate and generally applicable method for estimating aqueous solubilities for a diverse set of 1297 organic compounds, based on multilinear regression and artificial neural network modeling, was developed. Molecular connectivity, shape, and atom-type electrotopological state (E-state) indices were used as structural parameters. The data set was divided into a training set of 884 compounds and a randomly chosen test set of 413 compounds. The structural parameters in a 30-12-1 artificial neural network included 24 atom-type E-state indices and six other topological indices; for the test set, a predictive r² = 0.92 and s = 0.60 were achieved. With the same parameters, the multilinear regression gave r² = 0.88 and s = 0.71.
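The figures in the abstract can be checked with a little arithmetic. The sketch below is illustrative only and is not the authors' code. It assumes a fully connected 30-12-1 network with one bias per neuron, takes the predictive r² to be 1 - SS_res/SS_tot, and takes s to be a root-mean-square residual. The abstract states none of these definitions, and the observed and predicted values are made-up toy numbers, not data from the paper.

```python
import math


def count_parameters(layer_sizes):
    """Weights plus one bias per neuron in a fully connected network (assumed layout)."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rms_error(observed, predicted):
    """Root-mean-square residual, used here as a stand-in for s (assumption)."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_res / len(observed))


# Split sizes reported in the abstract.
n_train, n_test = 884, 413
n_total = n_train + n_test

# 30 inputs (24 atom-type E-state + 6 other topological indices), 12 hidden, 1 output.
n_params = count_parameters([30, 12, 1])

# Toy log-solubility values, invented purely to exercise the functions.
toy_observed = [-1.0, -2.0, -3.0, -4.0]
toy_predicted = [-1.5, -2.0, -2.5, -4.0]

print(n_total, n_params)
print(r_squared(toy_observed, toy_predicted), rms_error(toy_observed, toy_predicted))
```

Under these assumptions the network has 385 adjustable parameters for 884 training compounds, roughly 2.3 compounds per parameter, which is one reason a held-out test set matters for judging the reported statistics.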
Similar resources
Support Vector Machines for the Estimation of Aqueous Solubility
Support Vector Machines (SVMs) are used to estimate aqueous solubility of organic compounds. A SVM equipped with a Tanimoto similarity kernel estimates solubility with accuracy comparable to results from other reported methods where the same data sets have been studied. Complete cross-validation on a diverse data set resulted in a root-mean-squared error = 0.62 and R(2) = 0.88. The data input t...
Estimation of Aqueous Solubility of Chemical Compounds Using E-State Indices
The molecular weight and electrotopological E-state indices were used to estimate by Artificial Neural Networks aqueous solubility for a diverse set of 1291 organic compounds. The neural network with 33-4-1 neurons provided highly predictive results with r(2) = 0.91 and RMS = 0.62. The used parameters included several combinations of E-state indices with similar properties. The calculated resul...
Application of Random Forest and Multiple Linear Regression Techniques to QSPR Prediction of an Aqueous Solubility for Military Compounds.
The relationship between the aqueous solubility of more than two thousand eight hundred organic compounds and their structures was investigated using a QSPR approach based on Simplex Representation of Molecular Structure (SiRMS). The dataset consists of 2537 diverse organic compounds. Multiple Linear Regression (MLR) and Random Forest (RF) methods were used for statistical modeling at the 2D le...
Prediction of the pharmaceutical solubility in water and organic solvents via different soft computing models
Solubility data of solid in aqueous and different organic solvents are very important physicochemical properties considered in the design of the industrial processes and the theoretical studies. In this study, experimental solubility data of 666 pharmaceutical compounds in water and 712 pharmaceutical compounds in organic solvents were collected from different sources. Three different artificia...
A Fuzzy ARTMAP Based on Quantitative Structure-Property Relationships (QSPRs) for Predicting Aqueous Solubility of Organic Compounds
Quantitative structure-property relationships (QSPRs) for estimating aqueous solubility of organic compounds at 25 degrees C were developed based on a fuzzy ARTMAP and a back-propagation neural networks using a heterogeneous set of 515 organic compounds. A set of molecular descriptors, developed from PM3 semiempirical MO-theory and topological descriptors (first-, second-, third-, and fourth-or...
Journal title:
- Journal of chemical information and computer sciences
Volume 40, Issue 3
Pages: -
Publication date: 2000